DROPS

Document

Extreme Classification (Dagstuhl Seminar 18291)

Authors: Samy Bengio, Krzysztof Dembczynski, Thorsten Joachims, Marius Kloft, and Manik Varma

Published in: Dagstuhl Reports, Volume 8, Issue 7 (2019)

Abstract

Extreme classification is a rapidly growing research area within machine learning focusing on multi-class and multi-label problems involving an extremely large number of labels (even more than a million). Many applications of extreme classification have been found in diverse areas ranging from language modeling to document tagging in NLP, face recognition to learning universal feature representations in computer vision, gene function prediction in bioinformatics, etc. Extreme classification has also opened up a new paradigm for key industrial applications such as ranking and recommendation by reformulating them as multi-label learning tasks where each item to be ranked or recommended is treated as a separate label. Such reformulations have led to significant gains over traditional collaborative filtering and content-based recommendation techniques. Consequently, extreme classifiers have been deployed in many real-world applications in industry. Extreme classification has raised many new research challenges beyond the scope of traditional machine learning, including developing log-time and log-space algorithms, deriving theoretical bounds that scale logarithmically with the number of labels, learning from biased training data, developing performance metrics, etc. The seminar aimed to bring together experts in machine learning, NLP, computer vision, web search and recommendation from academia and industry to make progress on these problems. We believe that this seminar has encouraged interdisciplinary collaborations in the area of extreme classification, started a discussion on identifying thrust areas and important research problems, and motivated work on improving algorithms beyond the state of the art as well as on the theoretical foundations of extreme classification.

Cite as

Samy Bengio, Krzysztof Dembczynski, Thorsten Joachims, Marius Kloft, and Manik Varma. Extreme Classification (Dagstuhl Seminar 18291). In Dagstuhl Reports, Volume 8, Issue 7, pp. 62-80, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@Article{bengio_et_al:DagRep.8.7.62,
  author =	{Bengio, Samy and Dembczynski, Krzysztof and Joachims, Thorsten and Kloft, Marius and Varma, Manik},
  title =	{{Extreme Classification (Dagstuhl Seminar 18291)}},
  pages =	{62--80},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2019},
  volume =	{8},
  number =	{7},
  editor =	{Bengio, Samy and Dembczynski, Krzysztof and Joachims, Thorsten and Kloft, Marius and Varma, Manik},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.8.7.62},
  URN =		{urn:nbn:de:0030-drops-101739},
  doi =		{10.4230/DagRep.8.7.62},
  annote =	{Keywords: algorithms and complexity, artificial intelligence, computer vision, machine learning}
}

Document

DOI: 10.4230/DagRep.5.4.18

Machine Learning with Interdependent and Non-identically Distributed Data (Dagstuhl Seminar 15152)

Authors: Trevor Darrell, Marius Kloft, Massimiliano Pontil, Gunnar Rätsch, and Erik Rodner

Published in: Dagstuhl Reports, Volume 5, Issue 4 (2015)

Abstract

One of the most common assumptions in many machine learning and data analysis tasks is that the given data points are realizations of independent and identically distributed (IID) random variables. However, this assumption is often violated, e.g., when training and test data come from different distributions (dataset bias or domain shift) or the data points are highly interdependent (e.g., when the data exhibits temporal or spatial correlations). Both scenarios are typical situations in visual recognition and computational biology. For instance, computer vision and image analysis models can be learned from object-centric internet resources but are then often applied to real-world scenes. In computational biology and personalized medicine, training data may be recorded at a particular hospital, but the model is applied to make predictions on data from different hospitals, where patients exhibit a different population structure. In the seminar report, we discuss, present, and explore new machine learning methods that can deal with non-IID data as well as new application scenarios.

Cite as

Trevor Darrell, Marius Kloft, Massimiliano Pontil, Gunnar Rätsch, and Erik Rodner. Machine Learning with Interdependent and Non-identically Distributed Data (Dagstuhl Seminar 15152). In Dagstuhl Reports, Volume 5, Issue 4, pp. 18-55, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


@Article{darrell_et_al:DagRep.5.4.18,
  author =	{Darrell, Trevor and Kloft, Marius and Pontil, Massimiliano and R\"{a}tsch, Gunnar and Rodner, Erik},
  title =	{{Machine Learning with Interdependent and Non-identically Distributed Data (Dagstuhl Seminar 15152)}},
  pages =	{18--55},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2015},
  volume =	{5},
  number =	{4},
  editor =	{Darrell, Trevor and Kloft, Marius and Pontil, Massimiliano and R\"{a}tsch, Gunnar and Rodner, Erik},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.5.4.18},
  URN =		{urn:nbn:de:0030-drops-53497},
  doi =		{10.4230/DagRep.5.4.18},
  annote =	{Keywords: machine learning, computer vision, computational biology, transfer learning, domain adaptation}
}
